Jet substructure is typically studied using clustering algorithms, such as $k_T$, which arrange the 
jets' constituents into trees. Instead of considering a single tree per jet, we propose that multiple 
trees should be considered, weighted by an appropriate metric. Then each jet in each event produces 
a distribution for an observable, rather than a single value. Advantages of this approach include: 
1) observables have significantly increased statistical stability; and, 2) new observables, such as the 
variance of the distribution, provide new handles for signal and background discrimination. For 
example, we find that employing a set of trees substantially reduces the observed fluctuations in the 
pruned mass distribution, enhancing the likelihood of new particle discovery for a given integrated 
luminosity. Furthermore, the resulting pruned mass distributions for (background) QCD jets are 
found to be substantially wider than those for (signal) jets with intrinsic mass scales, e.g. boosted 
W jets. A cut on this width yields a substantial enhancement in significance relative to a cut on 
the standard pruned jet mass alone. In particular, the luminosity needed for a given significance 
requirement decreases by a factor of two relative to standard pruning. 



To develop intuition about high-energy collisions like 
those at the LHC it is often helpful to think of an event as 
being produced by a multi-stage process. In this picture, 
a short distance scattering produces a few hard partons. 
The partons then shower soft and collinear QCD radi- 
ation. Finally, at long distances, the (colored) partons 
bind into the (color singlet) hadrons that we observe in 
the detector. This parton-shower picture explains how 
clusters of nearby final-state particles, called jets, de- 
fined by a jet algorithm, can reveal something about the 
short-distance physics. Simulations of the parton shower 
produce events which, with sufficient tuning, exhibit re- 
markable agreement with collider data for nearly any con- 
ceivable infrared safe observable. 

If one takes the parton-shower picture literally, the 
constituents of a jet arise from a shower-like series of 
$1 \to 2$ splittings producing a "tree" structure. Since the 
shower model for QCD is dominated by soft and collinear 
splittings, any deviation from this behavior could indi- 
cate the presence of contamination within the jet, or 
might indicate that the jet is not purely of QCD ori- 
gin (e.g., it could come from a boosted heavy particle). 
Thus, by associating trees (by "trees," we mean "cluster- 
ing histories") to jets one can obtain useful information, 
and indeed this is the basis for much of the work in the 
field of jet substructure (see Ref. [1] for a review). 

The association of a tree to a jet naturally emerges 
from the parton-shower picture. In the parton shower, 
soft and collinear radiation is emitted in a particular 
sequence: a $p_T$-ordered shower builds a tree by adding 
emissions in decreasing order of transverse momentum, 
while an angular-ordered shower adds emissions in a 
sequence of decreasing angle. The recombination jet 
algorithms try to match this behavior. The $k_T$ algorithm [2] 
assembles a jet in increasing order of the (relative) $k_T$ 
metric, which depends on both the angle and the magnitude 
of the momentum, and the Cambridge/Aachen (C/A) 
algorithm [3] assembles in increasing order of angle. Both 
can be viewed as a reasonable guess for the showering 
sequence history. 

One problem with thinking of jet algorithms as revers- 
ing the parton shower is that the parton shower is not 
invertible - a given set of four-momenta of final state 
particles could have evolved through a multitude of in- 
termediate trees. In this paper we propose a way to ac- 
count for the non-invertible nature of the parton shower 
by associating to each jet a set of trees instead of a single 
tree. 

Related ideas have been discussed in the past. Long 
ago a probabilistic approach was used to improve the be- 
havior of seeded jet algorithms [4]. More recently, it has 
been shown that combining even highly correlated 
observables, such as jet masses arising from different 
grooming techniques [5], can improve discovery significance. In 
addition, Ref. [6] considered associating multiple trees to 
a jet to compare with models of showering in signal and 
background processes, and Ref. [7] proposed a measure 
of jet fuzziness to gauge the ambiguity in jet reconstruc- 
tion. However, our approach is fundamentally different 
from these previous studies. We are interested in observ- 
ables constructed from a distribution of trees for each jet 
in each event. For instance, we will show that by aver- 
aging tree-based observables over the trees for each jet, 
their statistical stability can be substantially improved. 

Associating a set of trees to a jet would not be 
feasible if one had to consider every tree which could be 
formed from a given set of final state four-momenta in a 
jet. Fortunately, the distributions that would be obtained 
from every tree can be approximated well through 
a procedure analogous to Monte-Carlo integration, 
allowing us to use a very small fraction of the trees. This is 
possible because infrared and collinear safe jet observables 
must be insensitive to small reshufflings of the momenta, 
implying that large classes of trees give very similar 
information. 

The algorithm we propose assembles a tree via a series 
of $2 \to 1$ mergings: 

1. At every stage of clustering, a set of weights $\omega_{ij}$ for 
all pairs $\langle ij \rangle$ of the four-vectors is computed, and a 
probability $\Omega_{ij} = \omega_{ij}/N$, where 
$N = \sum_{\langle ij \rangle} \omega_{ij}$, is assigned to each pair. 

2. A random number is generated and used to choose 
a pair $\langle ij \rangle$ with probability $\Omega_{ij}$. The chosen pair 
is merged, and the procedure is repeated until all 
particles are clustered. 

This algorithm directly produces trees distributed 
according to their weight $\prod_{\rm mergings} \Omega_{ij}$. To produce a 
distribution of trees for each jet, this algorithm is simply 
repeated $N_{\rm tree}$ times (not necessarily yielding $N_{\rm tree}$ 
distinct trees). Note that any algorithm which modifies a 
tree during its construction (e.g., jet pruning) can be 
adapted to work with this procedure, as demonstrated 
below. 
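
The sampling procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' FastJet plugin: the names `sample_tree`, `sample_trees` and `pair_weights`, the storage of four-vectors as `(E, px, py, pz)` tuples, and the choice to pass the weight rule in as a callback are all hypothetical and introduced only for this example.

```python
import random

def add_momenta(a, b):
    """Sum two four-vectors stored as (E, px, py, pz) tuples."""
    return tuple(x + y for x, y in zip(a, b))

def sample_tree(particles, pair_weights, rng):
    """Build one tree by weighted-random 2 -> 1 mergings.

    pair_weights(current, pairs) is a hypothetical callback returning one
    non-negative weight per candidate pair (i, j) of indices into current.
    Returns (history, probability, root): the ordered list of merged pairs,
    the product of the chosen merging probabilities, and the summed momentum.
    """
    current = list(particles)
    history = []
    probability = 1.0
    while len(current) > 1:
        pairs = [(i, j) for i in range(len(current))
                 for j in range(i + 1, len(current))]
        weights = pair_weights(current, pairs)
        norm = sum(weights)  # N = sum of all pair weights at this stage
        # Draw one pair with probability weight / N.
        k = rng.choices(range(len(pairs)), weights=weights, k=1)[0]
        i, j = pairs[k]
        probability *= weights[k] / norm
        history.append((current[i], current[j]))
        merged = add_momenta(current[i], current[j])
        current = [p for n, p in enumerate(current) if n not in (i, j)]
        current.append(merged)
    return history, probability, current[0]

def sample_trees(particles, pair_weights, n_tree, seed=0):
    """Repeat the procedure n_tree times; repeated trees are allowed."""
    rng = random.Random(seed)
    return [sample_tree(particles, pair_weights, rng) for _ in range(n_tree)]
```

Because each tree is drawn with a probability equal to the product of its merging probabilities, a plain average of an observable over the sampled trees already estimates its weighted average; no additional reweighting is needed in this sketch.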

One particularly interesting class of weights is given by 

$$\omega_{ij}^{(\alpha)} = \exp\left\{ -\alpha\, \frac{d_{ij} - d^{\min}}{d^{\min}} \right\} , \qquad (1)$$

with $\alpha$ a real number we call rigidity. Here, $d_{ij}$ is the jet 
distance measure for the $\langle ij \rangle$ pair, e.g., 

$$d_{k_T} = \min\{p_{Ti}^2, p_{Tj}^2\}\, \Delta R_{ij}^2 , \qquad d_{\rm C/A} = \Delta R_{ij}^2 , \qquad (2)$$

where $\Delta R_{ij}^2 = \Delta y_{ij}^2 + \Delta\phi_{ij}^2$, and $d^{\min}$ is the minimum 
over all pairs at this stage in the clustering. Note that 
with this metric, our algorithm reduces to a traditional 
clustering algorithm when $\alpha \to \infty$, i.e., in that limit 
the minimal $d_{ij}$ is always chosen. In this sense, it is 
helpful to think of the traditional, single tree algorithm 
as the "classical" approach, with $\alpha \sim 1/\hbar$ controlling the 
deviation from the "classical" clustering behavior. With 
this analogy, we call the trees constructed in this 
non-deterministic fashion Q-jets ("quantum" jets). 
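
To make the role of the rigidity concrete, the weights of Eq. (1) with the distance measures of Eq. (2) can be evaluated as in the sketch below. The function names, the storage of each input as a `(pT, y, phi)` tuple, and the requirement that the smallest distance be strictly positive are assumptions made for this illustration.

```python
import math

def delta_r2(a, b):
    """Squared distance in (y, phi) between inputs stored as (pT, y, phi)."""
    dy = a[1] - b[1]
    dphi = math.remainder(a[2] - b[2], 2.0 * math.pi)  # wrap to [-pi, pi]
    return dy * dy + dphi * dphi

def d_kt(a, b):
    """k_T measure of Eq. (2)."""
    return min(a[0], b[0]) ** 2 * delta_r2(a, b)

def d_ca(a, b):
    """Cambridge/Aachen measure of Eq. (2)."""
    return delta_r2(a, b)

def merge_probabilities(particles, alpha, dist=d_ca):
    """Return {(i, j): probability} from the weights of Eq. (1).

    Assumes alpha >= 0 and that no two inputs coincide exactly, so that
    the minimum distance is strictly positive.
    """
    pairs = [(i, j) for i in range(len(particles))
             for j in range(i + 1, len(particles))]
    d = [dist(particles[i], particles[j]) for i, j in pairs]
    d_min = min(d)
    w = [math.exp(-alpha * (dij - d_min) / d_min) for dij in d]
    norm = sum(w)
    return {pair: wi / norm for pair, wi in zip(pairs, w)}
```

With `alpha=0.0` every pair is equally likely, while for large `alpha` essentially all of the probability sits on the pair with the smallest distance, which is the classical limit described in the text.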

In order to get the most information out of the Q-jets, 
it is logical to consider observables which are sensitive 
to the ordering of the clusterings in the tree. One such 
observable is the pruned jet mass, which we will use as 
our illustrative example. As described in Ref. [8], pruning 
is one of the jet grooming tools [9]. It is used to 
sharpen signal and reduce background when considering 
boosted heavy objects. The basic idea is to move along 
the tree and try to discard radiation which is soft and not 
collinear, and therefore likely to represent contamination 
from a part of the event in which we are not particularly 
interested (like the underlying event). 

FIG. 1. Distribution of pruned jet mass for a single QCD 
jet with $p_T \sim 500$ GeV. The black and red solid lines show 
the classical pruned masses when the C/A and $k_T$ algorithms are 
used to cluster the jet. The black dashed (red dot-dashed) 
line shows the pruned jet mass distribution of 1000 
trees (constructed from the same jet in the same event), when 
the C/A ($k_T$) measure is used in Eq. (1). These distributions 
result from clusterings with rigidity $\alpha = 1.0$ (top) and 
$\alpha = 0.01$ (bottom). 

In detail, if a step 
in the clustering would merge particles $i$ and $j$ which 
satisfy 

$$z_{ij} \equiv \frac{\min(p_{Ti}, p_{Tj})}{|\vec{p}_{Ti} + \vec{p}_{Tj}|} < z_{\rm cut} \quad \text{and} \quad \Delta R_{ij} > D_{\rm cut} , \qquad (3)$$

then the merging is vetoed and the softer of the two 
four-momenta is discarded. In the specific analysis described 
here we take $z_{\rm cut} = 0.1$ and $D_{\rm cut} = m_{\rm jet}/p_{T,{\rm jet}}$, which are 
typical cuts for the C/A algorithm. 
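
As an illustration, the veto condition of Eq. (3) can be written as a small test applied to each candidate merging. This is a sketch only: the names `pruning_veto` and `survivor`, and the `(pT, y, phi)` representation of the inputs, are assumptions for the example, and a full implementation would also carry the four-momenta needed to perform the merging.

```python
import math

def pruning_veto(a, b, z_cut, d_cut):
    """Return True if merging a and b is vetoed by Eq. (3).

    a and b are (pT, y, phi) tuples; the transverse momenta are added as
    two-dimensional vectors to form the denominator of z.
    """
    (pt_a, y_a, phi_a), (pt_b, y_b, phi_b) = a, b
    px = pt_a * math.cos(phi_a) + pt_b * math.cos(phi_b)
    py = pt_a * math.sin(phi_a) + pt_b * math.sin(phi_b)
    z = min(pt_a, pt_b) / math.hypot(px, py)
    dphi = math.remainder(phi_a - phi_b, 2.0 * math.pi)  # wrap to [-pi, pi]
    delta_r = math.hypot(y_a - y_b, dphi)
    return z < z_cut and delta_r > d_cut

def survivor(a, b, z_cut, d_cut):
    """Return the harder input if the merging is vetoed, else None.

    None signals that the caller should merge a and b as usual.
    """
    if pruning_veto(a, b, z_cut, d_cut):
        return a if a[0] >= b[0] else b
    return None
```

For a jet with mass 80 GeV and transverse momentum 500 GeV, the choice quoted in the text would give `d_cut = 80.0 / 500.0`, so a soft input at an angular separation of 0.5 from a hard one is discarded, while the same input at a separation of 0.1 is kept.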

We apply this pruned Q-jets procedure to samples of 
simulated boosted W (signal) and QCD (background) 
jets generated with Pythia v6.422 [10] with $p_T$-ordered 
showers using the Perugia 2011 tunes [11] and assuming 
a 7 TeV LHC. In lieu of detector simulation we group 
the visible output of Pythia into massless 
$\Delta\eta \times \Delta\phi = 0.1 \times 0.1$ "calorimeter cells" (with $|\eta| < 5$), preserving the 
energy and the direction to the cell. The cells with energy 
greater than 0.5 GeV become the inputs to the initial 
jet-finding algorithm (small alterations to this cut have no 
appreciable impact on our results). To find the initial 
jets we use the anti-$k_T$ algorithm [12] with $R = 0.7$ as 
implemented in FastJet v2.4.2 [13] and require 
$p_T > 500$ GeV. Once a jet is identified, the cells clustered in the 
jet become input to the Q-jet pruning algorithm. A FastJet 
plugin with this implementation of Q-jets is available at 
http://jets.physics.harvard.edu/Qjets 
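
The cell grouping used in place of a detector simulation could look roughly like the following. This is a hedged sketch rather than the code used for the analysis: the function name, the `(E, eta, phi)` input format, and in particular the use of the cell centre as the cell direction are assumptions. Since $2\pi$ is not a multiple of 0.1, the last cell in $\phi$ is narrower than the others in this simplified version.

```python
import math
from collections import defaultdict

def calorimeter_cells(particles, cell=0.1, eta_max=5.0, e_min=0.5):
    """Group (E, eta, phi) particles into massless cells of size cell x cell.

    Returns a list of (E, px, py, pz) four-vectors, one per cell whose
    summed energy exceeds e_min (in GeV).
    """
    energy = defaultdict(float)
    for e, eta, phi in particles:
        if abs(eta) >= eta_max:
            continue  # outside the assumed acceptance
        key = (math.floor(eta / cell),
               math.floor((phi % (2.0 * math.pi)) / cell))
        energy[key] += e
    cells = []
    for (i_eta, i_phi), e in energy.items():
        if e <= e_min:
            continue  # below the energy threshold
        eta = (i_eta + 0.5) * cell  # assumed: direction of the cell centre
        phi = (i_phi + 0.5) * cell
        pt = e / math.cosh(eta)     # massless vector: E = pT * cosh(eta)
        cells.append((e, pt * math.cos(phi), pt * math.sin(phi),
                      pt * math.sinh(eta)))
    return cells
```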

Consider first a single QCD jet from the sample 
described above. Fig. 1 exhibits the pruned mass distribution 
for this jet obtained with the classical procedure for 
both $k_T$ and C/A pruning (the two vertical lines) and with 
$N_{\rm tree} = 1000$ using both the $k_T$ and C/A metrics for $d_{ij}$ 
in Eq. (1). The curves illustrate the dependence on the 
form of $d_{ij}$, as well as on the value of the rigidity 
parameter $\alpha$. The upper panel is for $\alpha = 1.0$, where the trees 
are confined to stay close to the classical tree and the 
pruned masses likewise stay near the corresponding 
classical result. For small enough $\alpha$ (say, $\alpha < 0.1$), a broad 
spectrum of trees is sampled. This is shown in the lower 
panel of Fig. 1 for $\alpha = 0.01$, where the distributions 
generated with the $k_T$ and C/A definitions of the distance 
$d_{ij}$ look similar, and have little correspondence with the 
classical results. This suggests that for a small enough 
rigidity parameter pruned Q-jets become independent of 
the choice of distance measure used; they are therefore 
more likely to be characterizing physical features of an 
event rather than artifacts of using a particular jet 
algorithm. 

We will now discuss two fundamentally different ways 
in which the discovery potential (e.g. for finding boosted 
W jets on top of their QCD background) can be enhanced 
using Q-jets: 

• Observables have smaller statistical variation. 
Even for the same number of background jets, the 
use of Q-jets reduces the background fluctuations 
$\delta B$ and increases the discovery potential $S/\delta B$, 
where $S$ and $B$ are the numbers of signal and 
background jets in the signal window and $\delta B$ denotes 
the fluctuation in $B$. 

• Qualitatively new observables, which depend on 
there being a distribution of trees for each jet, can 
now be considered. For example, we define below 
a powerful observable we call volatility, which 
measures the width of the pruned Q-jet mass 
distribution for each jet, something inaccessible to a 
classical jet algorithm. 

To quantify the first of these points, we consider a large 
number of pseudo-experiments, each of which analyses 
$N_J$ jets, with $N_J$ taken from a Poisson distribution with 
mean $\langle N_J \rangle$. With a classical jet algorithm we can extract 
a significance by counting, in each pseudo-experiment, 
the numbers $S$ and $B$ of W jets or QCD jets, respectively, 
with pruned mass in a signal window, say between 
70 and 90 GeV. The significance is then given by $\langle S \rangle/\delta B$, 
where $\langle S \rangle$ is the average over the pseudo-experiments 
of the number of signal events in the window and $\delta B$ is 
the RMS fluctuation of $B$ over those pseudo-experiments. 
As expected, $\langle S \rangle$ and $\langle B \rangle$ are proportional to $\langle N_J \rangle$, while 
$\delta S$ and $\delta B$ vary with $\sqrt{\langle N_J \rangle}$. In addition to looking at 
$\langle S \rangle/\delta B$, we can also look at the RMS fluctuations in the 
average pruned Q-jet mass of the signal jets, $\delta\langle m \rangle$, 
averaged over the signal jets in the signal window for each 
pseudo-experiment. This tells us the statistical 
uncertainty with which the W mass could be measured from 
these events. 

Vol. cut                                 Rigidity
($\mathcal{V}_{\rm cut}$)    $\alpha = 0$   $\alpha = 0.01$   $\alpha = 0.1$   $\alpha = 1$   $\alpha = 100$

$(\langle S \rangle/\delta B)|_{Q} \,/\, (\langle S \rangle/\delta B)|_{\rm cl}$
None                         1.07(1)        1.13(1)           1.18(1)          1.14(1)        1.06(1)
0.05                         1.43(4)        1.44(3)           1.39(3)          1.27(1)        1.08(1)
0.04                         1.51(4)        1.45(4)           1.39(3)          1.29(3)        1.10(1)
0.03                         1.51(2)        1.45(3)           1.37(4)          1.35(2)        1.10(1)
0.02                         1.28(5)        1.24(3)           1.28(3)          1.36(3)        1.13(1)

$\delta\langle m \rangle|_{\rm cl} \,/\, \delta\langle m \rangle|_{Q}$
None                         1.32(2)        1.31(2)           1.25(2)          1.10(2)        1.03(1)
0.05                         0.80(1)        0.80(1)           0.81(1)          0.96(1)        1.01(1)
0.04                         0.62(3)        0.69(3)           0.71(2)          0.93(1)        1.00(1)
0.03                         0.56(4)        0.57(5)           0.60(4)          0.87(1)        0.98(1)
0.02                         0.48(7)        0.49(7)           0.50(7)          0.77(2)        0.95(1)

TABLE I. The improvement found in various measurements 
performed using the Q-jet procedure compared to the classical 
pruning result, for a range of values of the rigidity parameter 
($\alpha$) and subject to a set of volatility cuts ($\mathcal{V} < \mathcal{V}_{\rm cut}$). The 
first set of rows exhibits the discovery potential $\langle S \rangle/\delta B$, while 
the second shows the average jet mass fluctuation $\delta\langle m \rangle$. In 
both cases results greater than unity indicate improvement 
over the classical pruning procedure (see the text for further 
discussion). For all quantities, the approximate statistical 
uncertainty for the last digit is shown in parentheses. 

With Q-jets, we can do something more sophisticated. 
Instead of the contribution of a given jet to $S$ or $B$ being 
1 or 0 depending on whether the pruned mass is in the 
signal window or not, the contribution of the jet is now 
a rational number between 0 and 1, given by the fraction 
of the $N_{\rm tree}$ pruned masses that fall in the signal mass 
window. This is a way of reducing the contribution from 
events which are less signal-like, without discarding them 
completely. In the limit $\alpha \to \infty$, this reduces to the 
classical measure, but for finite $\alpha$, we expect an improvement 
in both significance and in $\delta\langle m \rangle$. 
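
The fractional counting just described amounts to a one-line change in how each jet contributes to $S$ or $B$. A minimal sketch, assuming the pruned masses of the sampled trees for one jet are available as a list in GeV (the function name and the inclusive treatment of the window edges are choices made for this example):

```python
def window_fraction(masses, m_lo=70.0, m_hi=90.0):
    """Fraction of a jet's pruned masses that fall inside [m_lo, m_hi].

    masses holds one pruned mass per sampled tree of a single jet. With a
    single (classical) tree the result is either 0.0 or 1.0.
    """
    inside = sum(1 for m in masses if m_lo <= m <= m_hi)
    return inside / len(masses)

# Each jet adds its fraction, rather than 0 or 1, to the event count.
jets = [[75.0, 82.0, 95.0, 60.0], [80.0, 81.0, 79.0, 84.0]]
count = sum(window_fraction(masses) for masses in jets)  # 0.5 + 1.0
```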

For numerical analysis we use the C/A algorithm for 
both the classical and Q-jets cases and take $N_{\rm tree} = 256$. 
(We find that the results saturate for $N_{\rm tree} > 50$.) We 
present results in Table I as ratios of the Q-jets result 
to the classical result, indicating the improvement in 
significance and mass uncertainty we can expect. These 
ratios should be independent of $\langle N_J \rangle$, and so we 
determine statistical uncertainties by fitting to results for 
$\langle N_J \rangle = 5, 10, 15$ and 20. The approximate statistical 
uncertainties are shown in parentheses and apply to the last 
digit. We performed $10^4$ pseudo-experiments, expecting 
$O(1\%)$ statistical fluctuations from this procedure. 

FIG. 2. Distribution of volatility for signal (boosted W jets) 
and background (QCD jets) using a rigidity $\alpha = 0.01$. 

The first set of rows in Table I displays measurements 
of the discovery potential $\langle S \rangle/\delta B$ compared to the 
results with classical pruning. Focus on the rows labeled 
"None" for now (volatility is explained below). Since this 
quantity scales as $\sqrt{\mathcal{L}}$, where $\mathcal{L}$ is the integrated 
luminosity, the square of the number in the 
table can be interpreted as an effective luminosity 
improvement due to employing the Q-jet procedure. For 
example, for $\alpha = 0.1$ the number 1.18 means an effective 
increase in the luminosity by $(1.18)^2 - 1 = 39\%$. Larger 
$\alpha$ values confine the range of trees and yield results very 
near the classical pruning results, i.e., $\langle S \rangle/\delta B|_{Q} \to \langle S \rangle/\delta B|_{\rm cl}$. 
Smaller $\alpha$ values ($\alpha < 0.1$, with a much broader range 
of trees) also tend to degrade (decrease) the discovery 
potential. 
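
The luminosity interpretation can be spelled out in a few lines. In this short worked version the symbols $r$ (the ratio read from the table), $c$ (a luminosity-independent constant) and $\mathcal{L}'$ (the luminosity the classical analysis would need to match the Q-jets significance) are introduced purely for illustration:

```latex
\begin{align}
  \left.\frac{\langle S \rangle}{\delta B}\right|_{\rm cl} &= c\,\sqrt{\mathcal{L}}, &
  \left.\frac{\langle S \rangle}{\delta B}\right|_{Q} &= r\,c\,\sqrt{\mathcal{L}}, \\
  c\,\sqrt{\mathcal{L}'} &= r\,c\,\sqrt{\mathcal{L}}
  \quad\Longrightarrow\quad \mathcal{L}' = r^{2}\,\mathcal{L}, \\
  r &= 1.18: \qquad r^{2} - 1 = 1.3924 - 1 \approx 39\% .
\end{align}
```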

The second set of rows exhibits the average jet mass 
fluctuation ratio $\delta\langle m \rangle|_{\rm cl}/\delta\langle m \rangle|_{Q}$ (note classical over Q-jets here). 
Values greater than unity mean that the mass can be 
measured more precisely with the Q-jet procedure for the 
same luminosity. Note that there is continuing 
improvement in $\delta\langle m \rangle$ as $\alpha$ decreases. That we get sensible results 
for $\alpha = 0$ (i.e. with a flat distance measure) is presumably 
because pruning is relatively insensitive to which tree we 
assign; even for physically unlikely clusterings, the hard 
radiation that reconstructs the mass is typically not pruned 
away. 

The second way we have considered using Q-jets is in 
constructing qualitatively new types of observables. As 
an example, consider the volatility of a jet, defined by 

$$\mathcal{V} = \Gamma/\langle m \rangle , \qquad (4)$$

where $\Gamma = \sqrt{\langle m^2 \rangle - \langle m \rangle^2}$ and $\langle m \rangle$ are the RMS 
deviation and the mean of the pruned jet mass distribution 
for a single jet. The distribution of volatility for signal 
and background Q-jets with $\alpha = 0.01$ is shown in Fig. 2. 
We see that W jets have a lower volatility than QCD 
jets. This is easily understood, since the W jets have an 
intrinsic physical mass scale, while the QCD jets do not. 
Cutting on volatility, $\mathcal{V} < \mathcal{V}_{\rm cut}$, can therefore improve 
significance in a boosted W search. The improvement is 
given in Table I for various values of $\mathcal{V}_{\rm cut}$. 
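
Given the pruned masses of the sampled trees for one jet, the volatility of Eq. (4) is straightforward to compute. The sketch below assumes the masses are supplied as a non-empty list with a positive mean; the function name is chosen for this example only.

```python
import math

def volatility(masses):
    """Volatility of Eq. (4): RMS deviation of the masses over their mean."""
    n = len(masses)
    mean = sum(masses) / n
    mean_sq = sum(m * m for m in masses) / n
    # Guard against tiny negative values from floating-point rounding.
    gamma = math.sqrt(max(mean_sq - mean * mean, 0.0))
    return gamma / mean

def passes_volatility_cut(masses, v_cut):
    """True if the jet satisfies the cut (volatility below v_cut)."""
    return volatility(masses) < v_cut
```

A jet whose trees all give the same pruned mass has zero volatility, while a jet whose masses are split evenly between 60 and 100 GeV has a volatility of 0.25 and would fail any of the cuts listed in Table I.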

FIG. 3. The background versus signal efficiencies 
corresponding to a cut on volatility, for various values of $\alpha$, as compared to the 
classical pruning result. 

The efficiencies for a volatility cut on signal and 
background are shown in Fig. 3. These efficiencies are defined 
as the fraction of the Q-jets that yield a pruned mass in 
the mass bin after the volatility cut. We plot them 
normalized to the classical results ($\alpha = \infty$ with no volatility 
cut). In the limit $\alpha \to \infty$ the curve collapses to the point 
(1,1). The upper right region of the plot corresponds to 
large values of $\mathcal{V}_{\rm cut}$, i.e., effectively no volatility cut. We 
find that the largest signal significance is obtained for 
a volatility cut of approximately 0.03, where for $\alpha$ near 
zero we achieve a relative $\langle S \rangle/\langle B \rangle$ of $\sim 9$ and a relative 
$\langle S \rangle/\delta B$ improvement of $\sim 1.5$ (the square of this 
number is the factor of two quoted in the Abstract). This 
corresponds to the neighborhood of the point (0.25, 0.03) 
in Fig. 3. Finally, we note that the precision of the mass 
measurement, shown in the lower rows in the table, is 
somewhat degraded by placing a cut on the volatility. 
This should not be a surprise as the cut discards some of 
the signal jets. A more comprehensive discussion of the 
statistics and of volatility will be given in [14]. 
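
As a rough consistency check of the quoted numbers, one can read the point (0.25, 0.03) as (relative signal efficiency $\epsilon_S$, relative background efficiency $\epsilon_B$) and assume, as a simplification that is not exact for the fractional Q-jet counting, that $\delta B$ scales like $\sqrt{B}$. Both the reading of the point and the symbols $\epsilon_S$, $\epsilon_B$ are assumptions made for this estimate:

```latex
\begin{align}
  \frac{\epsilon_S}{\epsilon_B} &\approx \frac{0.25}{0.03} \approx 8.3, \\
  \frac{\epsilon_S}{\sqrt{\epsilon_B}} &\approx \frac{0.25}{\sqrt{0.03}} \approx 1.44,
  \qquad (1.44)^2 \approx 2.1 .
\end{align}
```

These estimates are in the same range as the values of roughly 9 and 1.5 quoted in the text, and the square of the second is close to the factor of two in luminosity.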

In this paper, we have shown that it can be advan- 
tageous to consider a large number of trees constructed 
from the same jet in a single event, rather than a single 
tree as is done in traditional clustering algorithms. Al- 
though this paper has focused on tree-based observables, 
the Q-jets idea, of using non-determinism in event analy- 
sis, can naturally be applied in many other ways. Indeed, 
most observables, including jet substructure observables, 
such as jet masses, moments, pull [15], jet shapes [16], 
etc., as well as more global observables, such as the 
number, distribution and 4-momenta for the jets in an event, 
work by trying to make the best guess at which prop- 
erties of which final state particles tell us the most in- 
formation about the underlying physics. The basic idea 
for Q-jets is that there is an inherent ambiguity in this 
best guess, both due to there not being a precise corre- 
spondence between final state particles and underlying 
physics, and due to our poor ability to extract that cor- 
respondence even if it were well-defined (as in a color 
singlet decay, for example). Thus, it would be natural 
to consider multiple interpretations of any observable, to 
see whether getting away from the best guess can give us 
more robust information about the underlying physics, as 
it has with the tree-based substructure considered here. 
It will be interesting to see in future work how far this 
non-deterministic approach can be pushed. 